PCA in Data-Dependent Noise (Correlated-PCA): Nearly Optimal Finite Sample Guarantees

نویسندگان

  • Namrata Vaswani
  • Praneeth Narayanamurthy
چکیده

We study Principal Component Analysis (PCA) in a setting where a part of the corrupting noise is data-dependent and, as a result, the noise and the true data are correlated. Under a bounded-ness assumption on the true data and the noise, and a simple assumption on data-noise correlation, we obtain a nearly optimal sample complexity bound for the most commonly used PCA solution, singular value decomposition (SVD). This bound is a significant improvement over the bound obtained by Vaswani and Guo in recent work (NIPS 2016) where this “correlated-PCA” problem was first studied; and it holds under a significantly weaker data-noise correlation assumption than the one used for this earlier result.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Correlated-PCA: Principal Components' Analysis when Data and Noise are Correlated

Given a matrix of observed data, Principal Components Analysis (PCA) computes a small number of orthogonal directions that contain most of its variability. Provably accurate solutions for PCA have been in use for a long time. However, to the best of our knowledge, all existing theoretical guarantees for it assume that the data and the corrupting noise are mutually independent, or at least uncor...

متن کامل

Streaming PCA: Matching Matrix Bernstein and Near-Optimal Finite Sample Guarantees for Oja's Algorithm

This work provides improved guarantees for streaming principle component analysis (PCA). Given A1, . . . ,An ∈ R sampled independently from distributions satisfying E [Ai] = Σ for Σ 0, this work provides an O(d)-space linear-time single-pass streaming algorithm for estimating the top eigenvector of Σ. The algorithm nearly matches (and in certain cases improves upon) the accuracy obtained by the...

متن کامل

Finite Sample Approximation Results for Principal Component Analysis: a Matrix Perturbation Approach

Principal Component Analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the non-asymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n, to those of the limiting population PCA as n → ∞. As in machine learning, we pr...

متن کامل

Finite Sample Approximation Results for Principal Component Analysis: a Matrix Perturbation Approach1 by Boaz Nadler

Principal component analysis (PCA) is a standard tool for dimensional reduction of a set of n observations (samples), each with p variables. In this paper, using a matrix perturbation approach, we study the nonasymptotic relation between the eigenvalues and eigenvectors of PCA computed on a finite sample of size n, and those of the limiting population PCA as n→∞. As in machine learning, we pres...

متن کامل

Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time

Sparse principal component analysis (PCA) involves nonconvex optimization for which the global solution is hard to obtain. To address this issue, one popular approach is convex relaxation. However, such an approach may produce suboptimal estimators due to the relaxation effect. To optimally estimate sparse principal subspaces, we propose a two-stage computational framework named “tighten after ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1702.03070  شماره 

صفحات  -

تاریخ انتشار 2017